RATIO-TYPE ESTIMATORS IN STRATIFIED RANDOM SAMPLING USING AUXILIARY ATTRIBUTE
INTRODUCTION
Prior knowledge of the population mean, coefficient of variation, kurtosis and correlation coefficient of an auxiliary variable is known to be very useful, particularly when ratio, product and regression estimators are used to estimate the population mean of a variable of interest. The use of auxiliary information can increase the precision of an estimator when the study variable is highly correlated with the auxiliary variable. Srivastava and Jhajj (1981) suggested a class of estimators of the population mean, provided that the mean and variance of the auxiliary variable are known. Singh and Tailor (2003) considered a modified ratio estimator that exploits the known value of the correlation coefficient. Singh and Upadhyaya (1999) suggested two ratio-type estimators for use when the coefficient of variation and kurtosis of the auxiliary variable are known.
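As a minimal illustration of how a known auxiliary mean enters a ratio-type estimator, the sketch below compares the plain sample mean with the classical ratio estimator, the sample mean of y multiplied by the ratio of the known population mean of x to the sample mean of x. The synthetic population, the sample size and the function names are assumptions made purely for illustration; they are not taken from the cited papers.

```python
import random
import statistics


def ratio_estimate(y_sample, x_sample, x_pop_mean):
    """Classical ratio estimator of the population mean of y: ybar * (Xbar / xbar)."""
    ybar = statistics.fmean(y_sample)
    xbar = statistics.fmean(x_sample)
    return ybar * x_pop_mean / xbar


def simulate(n_reps=2000, n=30, seed=1):
    """Return (MSE of sample mean, MSE of ratio estimator) over repeated samples."""
    rng = random.Random(seed)
    # Hypothetical population: y is roughly proportional to x, plus noise.
    x_pop = [rng.uniform(10, 100) for _ in range(1000)]
    y_pop = [2.0 * x + rng.gauss(0, 5) for x in x_pop]
    x_mean = statistics.fmean(x_pop)  # assumed known in advance
    y_mean = statistics.fmean(y_pop)  # target, used only to measure error

    mse_mean = 0.0
    mse_ratio = 0.0
    for _ in range(n_reps):
        # Simple random sample without replacement.
        idx = rng.sample(range(len(x_pop)), n)
        ys = [y_pop[i] for i in idx]
        xs = [x_pop[i] for i in idx]
        mse_mean += (statistics.fmean(ys) - y_mean) ** 2
        mse_ratio += (ratio_estimate(ys, xs, x_mean) - y_mean) ** 2
    return mse_mean / n_reps, mse_ratio / n_reps


mse_mean, mse_ratio = simulate()
print(f"MSE of sample mean:     {mse_mean:.3f}")
print(f"MSE of ratio estimator: {mse_ratio:.3f}")
```

Because y is constructed to be nearly proportional to x, the ratio estimator should show a much smaller mean squared error here; with weak correlation the gain would shrink or disappear.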
However, the fact that the known population proportion of an attribute provides a similar type of information has not drawn as much attention. In several situations, instead of auxiliary variables there exist auxiliary attributes that are highly correlated with the study variable (Singh et al., 2008). For example, the height of persons may depend on sex, the amount of milk produced may depend on the breed of cow, and the yield of a wheat crop may depend on the variety of wheat (Jhajj et al., 2006). In such situations, taking advantage of the point-biserial correlation between the study variable and the auxiliary attribute, estimators of the parameters of interest can be constructed using prior knowledge of the parameters of the auxiliary attribute. It is often useful to incorporate auxiliary information about the population in a sampling procedure. In practice, auxiliary information can be obtained in different ways. For example, the sampling frames often used in the production of official statistics may include auxiliary information on the population elements, or such data may be extracted from administrative registers and merged with the sampling frame elements. Aggregate-level auxiliary information can also be obtained from other sources, such as published official statistics. Auxiliary information can be very useful in the construction of an efficient sampling design, and in the estimation of population parameters it is used to improve efficiency for the variable of interest. Whenever auxiliary information is available, the researcher wants to utilize it in the method of estimation to obtain the most efficient estimator.
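The role of the point-biserial correlation and of a known attribute proportion can be sketched as follows. The attribute is coded as a 0/1 indicator, the point-biserial correlation is the ordinary Pearson correlation between the study variable and that indicator, and a simple ratio-type form multiplies the sample mean by P/p, where P is the known population proportion and p the sample proportion. This form, the function names and the toy data are assumptions for illustration only, not necessarily the estimators developed in this work.

```python
import math


def point_biserial(y, phi):
    """Pearson correlation between y and a 0/1 attribute indicator phi."""
    n = len(y)
    mean_y = sum(y) / n
    mean_phi = sum(phi) / n
    s_yphi = sum((a - mean_y) * (b - mean_phi) for a, b in zip(y, phi))
    s_yy = sum((a - mean_y) ** 2 for a in y)
    s_phiphi = sum((b - mean_phi) ** 2 for b in phi)
    return s_yphi / math.sqrt(s_yy * s_phiphi)


def attribute_ratio_estimate(y_sample, phi_sample, pop_proportion):
    """Ratio-type estimate ybar * (P / p), assuming P is known in advance."""
    ybar = sum(y_sample) / len(y_sample)
    p = sum(phi_sample) / len(phi_sample)
    if p == 0:
        raise ValueError("no sampled unit possesses the attribute")
    return ybar * pop_proportion / p


# Hypothetical sample: y is a yield, phi marks units of a particular variety.
y = [12.0, 15.0, 9.0, 14.0, 8.0, 16.0]
phi = [1, 1, 0, 1, 0, 1]
print("point-biserial correlation:", round(point_biserial(y, phi), 3))
print("ratio-type estimate:", round(attribute_ratio_estimate(y, phi, 0.6), 3))
```

When the correlation is positive and the sample happens to contain more units with the attribute than the population does (p > P), the factor P/p pulls the sample mean downward, which is the intended correction.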
In simple random sampling, the variance of the estimate (say, of the population mean Ȳ) depends, apart from the sample size, on the variability of the character y in the population. If the population is very heterogeneous and considerations of cost limit the size of the sample, it may be found impossible to get a sufficiently precise estimate by taking a simple random sample from the entire population. Populations encountered in practice are generally very heterogeneous (Raj and Chandhok, 1998). In surveys of manufacturing establishments, for example, it can be found that some establishments are very large, that is, they employ 1000 or more persons, while many others have only two or three persons on their rolls. Any estimate made from a direct random sample taken from the totality of such establishments would be subject to exceedingly large sampling fluctuations. But suppose it is possible to divide this population into parts or strata on the basis of, say, employment, thereby separating the very large establishments, the medium-sized ones and the smaller ones. If a random sample of establishments is now taken from each stratum, it should be possible to make a better estimate of each stratum average, which in turn should help in producing a better estimate of the population average. Similarly, if a sample is selected with probability proportional to x from the entire population, the variance of the estimate of the population total may be very high because the ratio of y to x varies considerably over the population. If a way can be found of subdividing the population so that the variation of the ratio of y to x is considerably reduced within the subdivisions or strata, a better estimate of the population total can be made. This is the basic consideration involved in the use of stratification for improving the precision of estimation (Raj and Chandhok, 1998).
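The gain from stratification described above can be sketched numerically. The code below builds a hypothetical population of establishments in three employment strata and compares the textbook variance of the sample mean under simple random sampling without replacement, (1 - n/N) S²/n, with the variance of the stratified mean, the sum over strata of W_h² (1 - n_h/N_h) S_h²/n_h, under proportional allocation. The stratum sizes, employment ranges and sample sizes are invented for illustration and are not taken from Raj and Chandhok (1998).

```python
import random
import statistics


def srs_variance(population, n):
    """Variance of the sample mean under SRS without replacement."""
    N = len(population)
    return (1 - n / N) * statistics.variance(population) / n


def stratified_variance(strata, sample_sizes):
    """Variance of the stratified mean: sum of W_h^2 (1 - n_h/N_h) S_h^2 / n_h."""
    N = sum(len(stratum) for stratum in strata)
    total = 0.0
    for stratum, n_h in zip(strata, sample_sizes):
        N_h = len(stratum)
        W_h = N_h / N  # stratum weight
        total += W_h ** 2 * (1 - n_h / N_h) * statistics.variance(stratum) / n_h
    return total


rng = random.Random(0)
# Hypothetical numbers of employees per establishment.
small = [rng.randint(2, 10) for _ in range(800)]
medium = [rng.randint(50, 200) for _ in range(150)]
large = [rng.randint(1000, 3000) for _ in range(50)]

strata = [small, medium, large]
population = small + medium + large
sample_sizes = [80, 15, 5]  # proportional allocation of n = 100

v_srs = srs_variance(population, sum(sample_sizes))
v_str = stratified_variance(strata, sample_sizes)
print(f"Variance under SRS:        {v_srs:.2f}")
print(f"Variance under stratified: {v_str:.2f}")
```

Most of the variability in this population lies between the strata rather than within them, so the stratified variance is far smaller; the advantage would narrow if the strata were internally as heterogeneous as the population as a whole.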
1.2 CENSUS VERSUS SAMPLE SURVEY
Broadly speaking, information on a population may be collected in two ways. Either every unit in the population is enumerated (called complete enumeration, or census) or enumeration is limited to only a part or sample selected from the population (called sample enumeration, or sample survey). A sample survey will usually be less costly than a complete census because the expense of covering all units would be greater than that of covering only a fraction of them. It will also take less time to collect and process data from a sample than from a census. But economy is not the only consideration; the most important point is whether the accuracy of the results would be adequate for the end in view. It is a curious fact that the results from a carefully planned and well-executed sample survey are expected to be more accurate (closer to the aim of the study) than those from a complete enumeration. A complete census ordinarily requires a huge and unwieldy organization, and therefore many types of errors creep in that cannot be controlled adequately. In a sample survey the volume of work is reduced considerably, and it becomes possible to employ persons of higher caliber, train them suitably, and supervise their work adequately. In a properly designed sample survey it is also possible to make a valid estimate of the margin of error and hence decide whether the results are sufficiently accurate. A complete census does not by itself reveal the margin of uncertainty to which it is subject. But there is not always a choice of one versus the other. For example, if the data are required for every small administrative area in a country, no sample survey of a reasonable size will be able to deliver the desired information; only a complete census can do this (Raj and Chandhok, 1998).